11 research outputs found

    Outcome prediction based on microarray analysis: a critical perspective on methods

    Background: Information extraction from microarrays has not yet been widely used in diagnostic or prognostic decision-support systems, owing to the diversity of results produced by the available techniques, their instability on different data sets, and the inability to relate statistical significance to biological relevance. Thus, there is an urgent need to address the statistical framework of microarray analysis and identify its drawbacks and limitations, which will enable us to thoroughly compare methodologies under the same experimental set-up and associate results with confidence intervals meaningful to clinicians. In this study we consider gene-selection algorithms with the aim of revealing inefficiencies in performance evaluation and addressing aspects that can reduce uncertainty in algorithmic validation.
    Results: A computational study is performed on the performance of several gene-selection methodologies on publicly available microarray data. Three basic types of experimental scenarios are evaluated, i.e. the independent test-set and the 10-fold cross-validation (CV) using maximum and average performance measures. Feature selection methods behave differently under different validation strategies. The performance results from CV do not match well those from the independent test-set, except for the support vector machines (SVM) and the least squares SVM methods. However, these wrapper methods achieve variable (often low) performance, whereas the hybrid methods attain consistently higher accuracies. The use of an independent test-set within CV is important for the evaluation of the predictive power of algorithms. The optimal size of the selected gene-set also appears to be dependent on the evaluation scheme. The consistency of selected genes over variation of the training-set is another aspect important in reducing uncertainty in the evaluation of the derived gene signature. In all cases the presence of outlier samples can seriously affect algorithmic performance.
    Conclusion: Multiple parameters can influence the selection of a gene-signature and its predictive power, thus possible biases in validation methods must always be accounted for. This paper illustrates that independent test-set evaluation reduces the bias of CV, and case-specific measures reveal stability characteristics of the gene-signature over changes of the training set. Moreover, frequency measures on gene selection address the algorithmic consistency in selecting the same gene signature under different training conditions. These issues contribute to the development of an objective evaluation framework and aid the derivation of statistically consistent gene signatures that could eventually be correlated with biological relevance. The benefits of the proposed framework are supported by the evaluation results and methodological comparisons performed for several gene-selection algorithms on three publicly available datasets.
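
The frequency measure mentioned in the abstract above can be illustrated with a minimal sketch. This is not the paper's implementation: the filter score (difference of class means over pooled spread), the fold construction, the toy data, and all function names are assumptions chosen for illustration. The sketch counts how often each gene index lands in the top-k list when the training set changes from fold to fold.

```python
import random
from collections import Counter


def filter_score(values, labels):
    """Hypothetical filter criterion: |mean difference| over pooled spread."""
    a = [v for v, lab in zip(values, labels) if lab == 0]
    b = [v for v, lab in zip(values, labels) if lab == 1]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    var_a = sum((v - mean_a) ** 2 for v in a) / len(a)
    var_b = sum((v - mean_b) ** 2 for v in b) / len(b)
    return abs(mean_a - mean_b) / ((var_a + var_b) ** 0.5 + 1e-12)


def select_top_genes(X, y, top_k):
    """Rank genes (columns of X) by the filter score and keep the best top_k."""
    n_genes = len(X[0])
    scores = [filter_score([row[g] for row in X], y) for g in range(n_genes)]
    ranked = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
    return ranked[:top_k]


def selection_frequency(X, y, top_k=2, n_folds=5, seed=0):
    """Count, per gene, the number of training folds in which it is selected."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    freq = Counter()
    for f in range(n_folds):
        held_out = set(idx[f::n_folds])  # samples left out of this fold
        train = [i for i in idx if i not in held_out]
        selected = select_top_genes([X[i] for i in train],
                                    [y[i] for i in train], top_k)
        freq.update(selected)
    return freq


def make_toy_data(n_samples=20, n_genes=6, seed=0):
    """Toy data (assumption): gene 0 separates the classes, the rest are noise."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n_samples)]
    X = [[5.0 * lab + rng.gauss(0, 1) if g == 0 else rng.gauss(0, 1)
          for g in range(n_genes)]
         for lab in y]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    print(selection_frequency(X, y, top_k=2, n_folds=5, seed=1))
```

A gene selected in every fold is stable under changes of the training set, while genes that appear only once or twice are more likely artefacts of a particular split.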

    Using a Single Neuron as a Marker Selector - A Breast Cancer Case Study

    The problem of marker selection in DNA microarray analysis has mostly been addressed by linear methods. RFE-SVM is a representative example, in which a linear kernel is the basic tool used to address the problem. On the other hand, a single neuron is known to be a linear estimator. In this study we explore such a single neuron to address the problem of marker selection. © 2007 IEEE
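
As a rough illustration of the idea in the abstract above, a single linear neuron can be trained on expression values and its features ranked by the magnitude of the learned weights. The abstract does not describe the training procedure, so everything here is an assumption: the delta-rule (LMS) update, the learning rate, the toy data, and the function names are hypothetical.

```python
import random


def train_linear_neuron(X, y, epochs=200, lr=0.01):
    """Fit out = w.x + b with the delta rule (assumed training scheme)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            out = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = target - out
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b


def rank_markers(w):
    """Order feature indices by decreasing absolute weight."""
    return sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)


def make_toy_expression(n_samples=40, n_features=5, informative=2, seed=0):
    """Toy data (assumption): one feature tracks the +/-1 label, others are noise."""
    rng = random.Random(seed)
    y = [1.0 if i % 2 else -1.0 for i in range(n_samples)]
    X = [[lab + rng.gauss(0, 0.3) if j == informative else rng.gauss(0, 1)
          for j in range(n_features)]
         for lab in y]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_expression()
    w, b = train_linear_neuron(X, y)
    print(rank_markers(w))
```

In a recursive scheme such as RFE, the lowest-ranked features would be dropped and the neuron retrained on the remainder; the sketch shows only a single ranking pass.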

    The linear neuron as marker selector and clinical predictor in cancer gene analysis

    Summarization: The problem of gene selection has been extensively studied in a number of scientific works using various kinds of methods. However, the application of a linear neuron is a novel approach possessing several advantages. In this work we propose to study the behavior of such a linear neuron, appropriately adapted and trained for the problem of gene selection in the DNA microarray experiment. Presented on: Computer Methods and Programs in Biomedicine

    A proposal for gene Signature integration

    Summarization: Gene expression patterns that can distinguish disease subclasses to a clinically significant degree not only play a prominent role in diagnosis but also lead to therapeutic strategies that tailor treatment to the particular biology of each disease. Nevertheless, gene expression signatures derived through statistical feature identification procedures on population datasets have received rightful criticism, since they share only a few genes in common for a particular pathology, even if they are derived from the same dataset using different methodologies. An optimistic view of this problem, emerging from the wealth of biological interactions, is that a statistical solution may not be unique. The derived signatures may be complementary parts of a global one, with each individual signature intersecting only a small part of the biological evidence. In this work we focus on the biological knowledge hidden behind different gene signatures and propose a methodology for integrating such knowledge towards retrieving a unified signature. Presented on: 9th International Conference on Information Technology and Applications in Biomedicine

    Comparison and unification of genomic signatures in breast cancer

    Summarization: The concept of deriving a gene signature in breast cancer has been addressed by different research groups, each one proposing a different solution with minor overlap among them. Unifying results among different research groups remains an open issue. In this study we evaluate two published signatures, namely the 70-gene signature of the Netherlands group and a 57-gene signature published in our previous study, and propose an evaluation platform under which these signatures can be compared effectively. After such an evaluation, we proceed with a unified signature and assess its performance, which shows improved efficiency over the initial signatures.

    Revealing significant biological knowledge via gene ontologies and pathways

    Summarization: Many scientific works in the field of bioinformatics and marker selection deal with the problem of deriving a gene signature with significant statistical properties without paying much attention to the biological aspect of the produced result. In this paper we assess the problem of revealing possible significant knowledge which might be hidden under a given gene signature, using previous biological information provided through gene ontologies and pathways.

    Wrapper filtering criteria via linear neuron and kernel approaches

    Summarization: The problem of marker selection in DNA microarray analysis has been addressed so far by two basic types of approaches, the so-called filter and wrapper methods. Wrapper methods operate in a recursive fashion, where feature (gene) weights are re-evaluated and change dynamically from iteration to iteration, while in filter methods feature weights remain fixed. Our objective in this study is to show that the application of filter criteria in a recursive fashion, where weights are potentially adjusted from cycle to cycle, produces a noticeable improvement in the generalization performance measured on independent test sets. Presented on: Computers in Biology and Medicine

    Identification of significant metabolic markers from MRSI data for brain cancer classification

    Summarization: The significance of metabolite peak-area ratios derived from brain magnetic resonance spectroscopic imaging (MRSI) spectra for brain tumor classification was investigated. Results show that in most binary classifications using SVM and LSSVM classifiers, the accuracy achieved was greater than 0.90 AUC, except in the case of grade 2 vs grade 3 gliomas, where 0.84 AUC was recorded due to the great heterogeneity of these two tumor types. The minimum, yet biologically significant, set of features (markers) at which the maximum AUCs were recorded was derived. Ratios of the N-acetyl-aspartate, choline, creatine and lipid metabolites were found to play the most crucial role in brain tumor discrimination. The biological importance of these markers was also verified against the literature. Finally, the influence of four magnetic resonance image (MRI) intensities on the classification process was also measured. It was found that MRI data do not significantly improve the classification accuracies.

    Integration of gene signatures using biological knowledge

    Summarization: Gene expression patterns that distinguish clinically significant disease subclasses may not only play a prominent role in diagnosis, but also lead to therapeutic strategies that tailor treatment to the particular biology of each disease. Nevertheless, gene expression signatures derived through statistical feature-extraction procedures on population datasets have received rightful criticism, since they share few genes in common, even when derived from the same dataset. We focus on the knowledge complementarities conveyed by two or more gene-expression signatures by means of embedded biological processes and pathways, which together form a meta-knowledge platform of analysis towards a more global, robust and powerful solution. Presented on: Artificial Intelligence in Medicine